Build An Agent From Scratch [4]: Giving the Agent Memory

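This is the fourth article in the "Build An Agent From Scratch" series. Building on the Agent Loop, the basic toolbox, and the Context Engine from the first three chapters, this chapter adds a Memory System to the agent and uses a layered memory architecture to address "forgetting" in long-running tasks.

When a task spans many turns, or even carries over into a new session, how should the agent hold on to important facts, and how does it get them back when it needs them?

Changes for this chapter: https://github.com/Tritium0041/Singularity/commit/675e567bed52522ab2ed5385cec14fa1878d5f29 and everything related to stage3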

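# Why do we still need a Memory System after building the Context Engine?

In the previous chapter, we put the model's context under the unified management of the Context Engine and separated the model's message history from the context used for each request, so what the model actually sees on each turn is the request view produced by the Context Engine. But once a task gets long enough and the context has been compressed and reorganized several times, facts that appeared earlier can drop out of the model's attention. This is where a Memory layer comes in: it lets the model actively write key information to non-volatile storage, then read and modify it when needed.

In other words, the memory system answers three questions:

Which facts are worth extracting from the short-term context?
Should those facts live in the current task, or be kept long term? (memory layering)
When the model needs them, how does it get them back?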

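# Splitting the model's attention into three layers

If every fact goes into memory, the model quickly gets confused. In the current implementation, I split the information the model touches into three layers.

The first layer is the messages, tool calls, tool results, and compression summaries in the current context. It is managed by the Context Engine from Chapter 3. Together they form the agent's short-term memory, whose goal is to supply the information needed to complete the current step.

The second layer is temporary (mid-term) memory: the Workspace notes that live for the lifetime of the current task. It works like a task scratchpad for interim conclusions, important files, error records, to-do items, and design decisions. Information in the Workspace should not disappear just because history was compressed; it stays stable for the duration of a session.

The third layer is long-term memory: the Memory Store, which persists across tasks and sessions. It suits user preferences, project conventions, reusable lessons, and long-lived knowledge. For example: "this user likes TypeScript examples", "finished markdown drafts in this repo are not committed to git by default", "in this project, memory content must be recalled tool-first".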

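# Design principles for memory tools

A few shared design principles can be distilled from the memory systems of comparable tools.

The first principle is separating reads from writes. Reading memory and writing memory are not the same thing. Reads happen throughout an ordinary task and should be lightweight, controllable, and explainable. Writes mean the system has to decide what deserves long-term storage, which is riskier and more likely to pollute the memory store.

The second principle is that memory must be explainable. For any recall, both why the retrieval matched and exactly what was returned to the model must be clearly attributable, so that a person can understand why the model reasoned the way it did.

The third principle is that memory must not steal the show. Having memory does not mean the model should see all of it on every turn. Too many recalled old preferences, old lessons, and old project conventions crowd the short-term context and pull attention away from the current task.

Our implementation follows these principles as well: it stays restrained and explainable first, and within those limits tries to make memory as stable as possible.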

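# Overall implementation of the memory system

This chapter covers two related groups of work.

The first group is the Memory System, whose core files all live in src/memory/: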

	src/memory/
	  index.ts
	  types.ts
	  workspace.ts
	  memory-store.ts
	  memory-tools.ts
	  instructions.ts
	  phase-summary.ts

The second group is the demo's multi-session and resume support, whose core files live in src/session/ and are wired into examples/run-agent.ts:

	src/session/
	  index.ts
	  session-store.ts
	  session-title.ts

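These two groups of changes ship together because Memory is not an isolated tool. The mid-term Workspace has to be saved along with the current session; long-term memory has to exist independently of sessions; and the TUI needs to show the current workspace notes, turn long-term memory on or off, manually compact history, and create new sessions.

The resulting data flow looks like this: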

	user input
	-> append user message to agent.history
	-> build runtime tool registry
	   -> core tools
	   -> dynamic compression tool if enabled
	   -> memory tools if enabled
	-> build system prompt
	   -> default instructions
	   -> environment background
	   -> static memory instructions
	-> ContextEngine.prepareWithHandoff()
	-> request view
	-> LLM
	-> answer or tool calls
	-> execute tools
	   -> memory content returns as tool result when requested
	-> append assistant/tool messages to history
	-> after first completed request, enqueue background session title
	-> session store persists history and Workspace state

A session file stores the current session's history, Workspace notes, usage telemetry, and title. Long-term memory lives in .agent-memory/MEMORY.md by default.

# Mid-term memory: Workspace Notes

The Workspace types are minimal:

	export const WORKSPACE_NOTE_KINDS = ["note", "decision", "file", "error", "todo"] as const;

	export type WorkspaceNote = {
	  id: string;
	  kind: WorkspaceNoteKind;
	  content: string;
	  createdAt: string;
	  updatedAt: string;
	};

	export type WorkspaceState = {
	  notes: WorkspaceNote[];
	};

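These kinds correspond to what an agent most often needs to keep during a task:

note: general facts.
decision: design choices that have already been made.
file: important files, or entry points that have been read.
error: key errors and the reasons for failure.
todo: things to do later.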
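The types above reference WorkspaceNoteKind, and the class below calls validateKind and cloneNote, none of which are shown in this article. The following is a minimal sketch of how they might look; the bodies and the isWorkspaceNoteKind helper are assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical helper bodies: the article names WorkspaceNoteKind, validateKind
// and cloneNote but does not show them, so everything below is an assumption.
const WORKSPACE_NOTE_KINDS = ["note", "decision", "file", "error", "todo"] as const;

// Derive the union "note" | "decision" | ... from the const array, so the
// runtime list and the compile-time type cannot drift apart.
type WorkspaceNoteKind = (typeof WORKSPACE_NOTE_KINDS)[number];

type WorkspaceNote = {
  id: string;
  kind: WorkspaceNoteKind;
  content: string;
  createdAt: string;
  updatedAt: string;
};

// Runtime guard for values from untrusted sources (tool arguments, session files).
function isWorkspaceNoteKind(value: unknown): value is WorkspaceNoteKind {
  return typeof value === "string" && (WORKSPACE_NOTE_KINDS as readonly string[]).includes(value);
}

// Throws on anything that is not one of the five known kinds.
function validateKind(value: unknown): asserts value is WorkspaceNoteKind {
  if (!isWorkspaceNoteKind(value)) {
    throw new Error(`Unknown workspace note kind: ${String(value)}`);
  }
}

// A shallow copy is enough here because every field is a primitive string.
function cloneNote(note: WorkspaceNote): WorkspaceNote {
  return { ...note };
}
```

Deriving the type from the array means adding a sixth kind only requires touching one line, and the runtime check picks it up automatically.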

The state itself is managed by WorkspaceMemory:

	export class WorkspaceMemory {
	  private readonly notes: WorkspaceNote[];

	  constructor(initial?: WorkspaceState) {
	    // Copy the incoming notes so the caller's array is never shared.
	    this.notes = (initial?.notes ?? []).map(cloneNote);
	    for (const note of this.notes) {
	      validateKind(note.kind);
	      if (note.content.trim() === "") {
	        throw new Error("Workspace note content must be non-empty.");
	      }
	    }
	  }

	  get state(): WorkspaceState {
	    // Return clones, never the internal array.
	    return {
	      notes: this.notes.map(cloneNote)
	    };
	  }

	  // Method signatures only; bodies are omitted in this excerpt.
	  write(input: { kind?: WorkspaceNoteKind; content: string }): WorkspaceNote;
	  read(filter?: { id?: string; kind?: WorkspaceNoteKind }): WorkspaceNote[];
	  update(input: { id: string; kind?: WorkspaceNoteKind; content?: string }): WorkspaceNote;
	  delete(id: string): boolean;
	  clear(): void;
	}

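A few details here look ordinary but matter:

First, content is trimmed and must not be empty. What a memory system fears most is pollution by empty notes, half-written notes, and meaningless notes.

Second, state returns a clone, so outside code cannot modify the internal array directly. The Workspace is the agent's runtime state, and callers must not be able to grab a reference and bypass validation.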
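To make these two details concrete, here is a minimal sketch of what the omitted method bodies could look like. The class name WorkspaceMemorySketch, the counter-based id scheme, and the error messages are assumptions for illustration; only the trim-and-reject rule and the copy-on-return rule come from the text above.

```typescript
// A sketch only: method bodies, the id scheme and the "not found" error are
// assumptions, not the project's actual implementation.
type SketchNoteKind = "note" | "decision" | "file" | "error" | "todo";
type SketchNote = { id: string; kind: SketchNoteKind; content: string; createdAt: string; updatedAt: string };

class WorkspaceMemorySketch {
  private readonly notes: SketchNote[] = [];
  private nextId = 1;

  write(input: { kind?: SketchNoteKind; content: string }): SketchNote {
    const content = input.content.trim();
    if (content === "") throw new Error("Workspace note content must be non-empty.");
    const now = new Date().toISOString();
    const note: SketchNote = { id: `note_${this.nextId++}`, kind: input.kind ?? "note", content, createdAt: now, updatedAt: now };
    this.notes.push(note);
    return { ...note }; // hand out a copy, never the stored object
  }

  read(filter: { id?: string; kind?: SketchNoteKind } = {}): SketchNote[] {
    return this.notes
      .filter((n) => (filter.id === undefined || n.id === filter.id) && (filter.kind === undefined || n.kind === filter.kind))
      .map((n) => ({ ...n }));
  }

  update(input: { id: string; kind?: SketchNoteKind; content?: string }): SketchNote {
    const note = this.notes.find((n) => n.id === input.id);
    if (!note) throw new Error(`Workspace note not found: ${input.id}`);
    if (input.content !== undefined) {
      const content = input.content.trim();
      if (content === "") throw new Error("Workspace note content must be non-empty.");
      note.content = content;
    }
    if (input.kind !== undefined) note.kind = input.kind;
    note.updatedAt = new Date().toISOString();
    return { ...note };
  }

  delete(id: string): boolean {
    const index = this.notes.findIndex((n) => n.id === id);
    if (index === -1) return false;
    this.notes.splice(index, 1);
    return true;
  }

  clear(): void {
    this.notes.length = 0;
  }
}
```

Because every read path returns a copy, mutating a returned note has no effect on the stored state; the only way to change a note is through update, which re-runs the validation.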

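The corresponding tools are dynamically written into the system instructions. write_note stores important state for the current task, read_note reads notes by id or kind, and update_workspace can either update a note or delete one (passing delete: true deletes it). The agent can also call list_notes at any time to list all current notes and see the first 80 characters of each.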
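As a rough illustration of the last two tools, the sketch below shows an 80-character preview for list_notes and a single handler that treats delete: true as deletion for update_workspace. The function names, argument types, and result shapes are hypothetical; only the tool names, the 80-character preview, and the delete flag come from the description above.

```typescript
// Hypothetical helper names and return shapes, for illustration only.
type NoteLike = { id: string; kind: string; content: string };

const NOTE_PREVIEW_CHARS = 80;

// list_notes: return only the head of each note so the listing stays cheap.
function listNotePreviews(notes: NoteLike[]): { id: string; kind: string; preview: string }[] {
  return notes.map((note) => ({
    id: note.id,
    kind: note.kind,
    preview: note.content.slice(0, NOTE_PREVIEW_CHARS)
  }));
}

type UpdateWorkspaceArgs = { id: string; kind?: string; content?: string; delete?: boolean };
type UpdateWorkspaceResult = { action: "deleted" | "updated" | "not_found"; notes: NoteLike[] };

// update_workspace: one tool, two behaviours, selected by the delete flag.
function applyUpdateWorkspace(notes: NoteLike[], args: UpdateWorkspaceArgs): UpdateWorkspaceResult {
  if (!notes.some((n) => n.id === args.id)) return { action: "not_found", notes };
  if (args.delete === true) {
    return { action: "deleted", notes: notes.filter((n) => n.id !== args.id) };
  }
  const updated = notes.map((n) =>
    n.id === args.id
      // Assumption: blank replacement content is ignored and the old text is kept.
      ? { ...n, kind: args.kind ?? n.kind, content: args.content?.trim() || n.content }
      : n
  );
  return { action: "updated", notes: updated };
}
```

Listing previews first and reading full content second keeps the tool results small, which matters because every tool result ends up in the context the model has to pay attention to.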

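# Long-term memory: Markdown Memory Store

Long-term memory is about the instructions the user expects the agent to follow consistently across tasks and sessions.

Following Codex's approach, we implemented a MarkdownMemoryStore. The default memory path is:

.agent-memory/MEMORY.md

After initialization, the file looks like this: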

	# Singularity Memory

	Long-term memory for this local agent. Entries are append-only Markdown blocks.

Each memory is a Markdown block under a level-2 heading:

	## mem_20260613_201530_ab12
	- tags: typescript, preference
	- source: user
	- created_at: 2026-06-13T20:15:30.000Z
	- updated_at: 2026-06-13T20:15:30.000Z

	User prefers TypeScript for scripts.
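To show why this format is cheap to parse, here is a minimal sketch of a parser for the entry layout above. The function name parseMemoryMarkdown and the ParsedMemoryEntry type are hypothetical, and the sketch assumes that no memory body contains a line starting with "## "; the real store may handle more cases.

```typescript
// Hypothetical parser for the entry layout shown above; not the project's code.
type ParsedMemoryEntry = {
  id: string;
  tags: string[];
  source: string;
  createdAt: string;
  updatedAt: string;
  content: string;
};

function parseMemoryMarkdown(markdown: string): ParsedMemoryEntry[] {
  const entries: ParsedMemoryEntry[] = [];
  // Split at every line that starts a level-2 heading. The first chunk is the
  // file header ("# Singularity Memory ...") and is dropped.
  const blocks = markdown.split(/^## /m).slice(1);

  for (const block of blocks) {
    const lines = block.split(/\r?\n/);
    const id = (lines[0] ?? "").trim();

    // Consume the "- key: value" metadata lines that follow the heading.
    const meta: Record<string, string> = {};
    let cursor = 1;
    while (cursor < lines.length) {
      const match = /^- ([a-z_]+): (.*)$/.exec(lines[cursor] ?? "");
      if (!match) break;
      meta[match[1] ?? ""] = (match[2] ?? "").trim();
      cursor++;
    }

    const tags = (meta["tags"] ?? "")
      .split(",")
      .map((tag) => tag.trim())
      .filter((tag) => tag !== "");

    entries.push({
      id,
      tags,
      source: meta["source"] ?? "",
      createdAt: meta["created_at"] ?? "",
      updatedAt: meta["updated_at"] ?? "",
      // Everything after the metadata lines is the memory body.
      content: lines.slice(cursor).join("\n").trim()
    });
  }
  return entries;
}
```

Keeping the metadata and the body as separate fields at parse time is also what makes it possible to search and snippet the body without touching the metadata lines.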

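The advantage of Markdown is that the implementation can stay very lightweight: the parsing logic is simple and no database migrations are needed. The agent can also search the file with its ordinary search tools, so recall works without extra machinery.

MarkdownMemoryStore exposes several kinds of methods: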

	store(input): Promise<MemoryEntry>
	list(options): Promise<MemoryEntry[]>
	search(query, options): Promise<MemorySearchResult[]>
	clear(): Promise<void>
	update(input): Promise<MemoryEntry>
	upsertByTag(input): Promise<{ entry: MemoryEntry; created: boolean }>

store() is the most basic write. It normalizes the content, tags, and source, then appends a new memory block.
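The normalize-then-append step can be sketched as three pure functions. Everything here is an assumption made for illustration: the function names, the lowercase-and-dedupe tag rule, the default source value "agent", and the idea that the four-character id suffix is supplied by the caller. Only the id layout and the block layout are taken from the example entry shown earlier.

```typescript
// Hypothetical sketch of store()'s normalization; not the project's code.
type NewMemoryInput = { content: string; tags?: string[]; source?: string };
type MemoryRecord = { id: string; content: string; tags: string[]; source: string; createdAt: string; updatedAt: string };

// Build an id shaped like "mem_20260613_201530_ab12" from a UTC timestamp.
function makeMemoryId(now: Date, suffix: string): string {
  const iso = now.toISOString(); // e.g. "2026-06-13T20:15:30.000Z"
  const date = iso.slice(0, 10).replace(/-/g, "");
  const time = iso.slice(11, 19).replace(/:/g, "");
  return `mem_${date}_${time}_${suffix}`;
}

function normalizeMemoryInput(input: NewMemoryInput, now: Date, suffix: string): MemoryRecord {
  const content = input.content.trim();
  if (content === "") throw new Error("Memory content must be non-empty.");
  // Assumption: tags are trimmed, lowercased and de-duplicated.
  const tags = Array.from(
    new Set((input.tags ?? []).map((tag) => tag.trim().toLowerCase()).filter((tag) => tag !== ""))
  );
  const timestamp = now.toISOString();
  return {
    id: makeMemoryId(now, suffix),
    content,
    tags,
    source: (input.source ?? "").trim() || "agent", // assumed default
    createdAt: timestamp,
    updatedAt: timestamp
  };
}

// Render one record as the Markdown block that gets appended to the file.
function formatMemoryBlock(record: MemoryRecord): string {
  return [
    `## ${record.id}`,
    `- tags: ${record.tags.join(", ")}`,
    `- source: ${record.source}`,
    `- created_at: ${record.createdAt}`,
    `- updated_at: ${record.updatedAt}`,
    "",
    record.content,
    ""
  ].join("\n");
}
```

With the block rendered as a string, the write itself reduces to appending that string to the memory file, which is what keeps the store free of any database layer.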

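list() can list every entry, or filter entries by tag.

search() does lightweight keyword retrieval rather than embedding retrieval. It:

Lowercases and tokenizes the query.
Narrows the candidate set by tag first, when a tag filter is given.
Gives different weights to whole-phrase hits, tag hits, and term hits (whole phrase: 6 points, exact tag match: 4, tag contains term: 2, term hit: 1, plus 2 extra when a tag filter applies).
Sorts by score and then by update time.
Returns a snippet.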
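The steps above can be sketched as a single scoring function. The weights are the ones listed in the text, but how they combine is an assumption: this sketch scores each query term once, lets a tag match and a body match stack, adds the tag-filter bonus only to entries that already matched something, and defaults to five results. The names keywordSearch, SearchableEntry, and KeywordHit are hypothetical.

```typescript
// Sketch of weighted keyword search; the combination rules are assumptions.
type SearchableEntry = { id: string; content: string; tags: string[]; updatedAt: string };
type KeywordHit = { entry: SearchableEntry; score: number; matchedTerms: string[] };

const SEARCH_WEIGHTS = { phrase: 6, tagExact: 4, tagContains: 2, term: 1, tagFilter: 2 };

function keywordSearch(
  entries: SearchableEntry[],
  query: string,
  options: { tags?: string[]; maxResults?: number } = {}
): KeywordHit[] {
  // Step 1: lowercase and tokenize the query on whitespace.
  const phrase = query.trim().toLowerCase();
  const terms = Array.from(new Set(phrase.split(/\s+/).filter((t) => t !== "")));
  const filterTags = (options.tags ?? []).map((t) => t.toLowerCase());
  const hits: KeywordHit[] = [];

  for (const entry of entries) {
    const tags = entry.tags.map((t) => t.toLowerCase());
    // Step 2: the optional tag filter narrows the candidate set first.
    if (filterTags.length > 0 && !filterTags.some((t) => tags.includes(t))) continue;

    // Step 3: accumulate weighted hits.
    const content = entry.content.toLowerCase();
    let score = 0;
    const matchedTerms: string[] = [];
    if (phrase !== "" && content.includes(phrase)) score += SEARCH_WEIGHTS.phrase;
    for (const term of terms) {
      let matched = false;
      if (tags.includes(term)) {
        score += SEARCH_WEIGHTS.tagExact;
        matched = true;
      } else if (tags.some((t) => t.includes(term))) {
        score += SEARCH_WEIGHTS.tagContains;
        matched = true;
      }
      if (content.includes(term)) {
        score += SEARCH_WEIGHTS.term;
        matched = true;
      }
      if (matched) matchedTerms.push(term);
    }
    if (score === 0) continue; // nothing matched, so this entry is not a hit
    if (filterTags.length > 0) score += SEARCH_WEIGHTS.tagFilter;
    hits.push({ entry, score, matchedTerms });
  }

  // Step 4: highest score first; ties go to the most recently updated entry.
  hits.sort((a, b) => {
    if (b.score !== a.score) return b.score - a.score;
    if (a.entry.updatedAt === b.entry.updatedAt) return 0;
    return a.entry.updatedAt < b.entry.updatedAt ? 1 : -1;
  });
  return hits.slice(0, options.maxResults ?? 5);
}
```

Because every point in the score maps to a named rule and the matched terms are returned alongside it, a person reading the tool result can see why an entry was recalled, which is the explainability principle from earlier in the chapter.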

The snippet is taken from the body first, not from the metadata. Otherwise, a search for typescript would likely show only - tags: typescript and none of the actual memory content.
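One way to get this behaviour is to build the snippet from the entry body only, so metadata lines can never appear in it. The sketch below is an assumption about how that might be done: the function name buildSnippet, the 40-character radius, and the ellipsis markers are all hypothetical.

```typescript
// Hypothetical snippet builder; operates on the entry body only, so the
// "- tags: ..." metadata lines are excluded by construction.
function buildSnippet(body: string, terms: string[], radius = 40): string {
  const lower = body.toLowerCase();

  // Find the earliest position where any query term occurs in the body.
  let first = -1;
  for (const term of terms) {
    const index = lower.indexOf(term.toLowerCase());
    if (index !== -1 && (first === -1 || index < first)) first = index;
  }

  // No term in the body (for example a tag-only match): fall back to the head.
  if (first === -1) return body.slice(0, radius * 2).trim();

  // Otherwise cut a window around the first hit and mark truncated sides.
  const start = Math.max(0, first - radius);
  const end = Math.min(body.length, first + radius);
  const prefix = start > 0 ? "…" : "";
  const suffix = end < body.length ? "…" : "";
  return prefix + body.slice(start, end).trim() + suffix;
}
```

For a short body such as the example entry, the window covers the whole text and the snippet equals the content, which matches the sample result shown further down.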

At the model layer, we expose two atomic operations on long-term memory, focused on storing and recalling: store_memory and search_memory.

The description of store_memory is strict:

	Store durable long-term memory as a local Markdown entry.
	Only save user preferences, project conventions, or reusable lessons.
	Never store secrets, credentials, or one-off temporary facts.

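This description is effectively the boundary of long-term memory. It should not store:

One-off task details.
Temporary search results.
Keys, tokens, cookies.
Ordinary file content that can be read again from the current repo.

What it should store is what future tasks will also need:

User preferences.
Project conventions.
Lessons learned from repeatedly hitting the same pitfall.
Stable ways of operating a given repo.

search_memory is the read entry point: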

	{
	  "query": "TypeScript script preference",
	  "tags": ["preference"],
	  "maxResults": 3
	}

The result carries back all of the entry's core fields:

	{
	  "results": [
	    {
	      "id": "mem_...",
	      "content": "User prefers TypeScript for scripts.",
	      "tags": ["typescript", "preference"],
	      "source": "user",
	      "score": 8,
	      "matchedTerms": ["typescript"],
	      "snippet": "User prefers TypeScript for scripts."
	    }
	  ]
	}

# Static Memory Instructions

Tools alone are not enough; the model also needs to know when to use them.

So the Memory System adds a stable prompt fragment:

	export function buildMemoryInstructions(options: { hasWorkspace: boolean; hasStore: boolean }): PromptFragment | undefined {
	  const lines = ["Memory tools are available. Use them proactively when they can materially improve the answer:"];

	  if (options.hasStore) {
	    lines.push("- Use search_memory before relying on remembered user preferences, project conventions, or prior solutions.");
	  }
	  if (options.hasWorkspace) {
	    lines.push("- Use list_notes to inspect current task workspace notes before reading full note content.");
	    lines.push("- Use read_note when you need the full content of specific workspace notes.");
	    lines.push("- Use write_note for important current-task state that should survive context compaction.");
	  }
	  if (options.hasStore) {
	    lines.push("- Use store_memory only for durable preferences, project conventions, or reusable lessons.");
	    lines.push("- Never store secrets or one-off temporary facts.");
	  }

	  lines.push("- Do not assume memory exists. Treat only tool results as retrieved memory evidence.");

	  return {
	    id: "memory-instructions",
	    stable: true,
	    content: lines.join("\n")
	  };
	}

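Note the last line: Do not assume memory exists. Treat only tool results as retrieved memory evidence.

It is the core constraint of the whole v1. The model must not pretend it already knows the long-term memory just because the system prompt says "memory tools are available". It has to call a tool, get a tool result, and only then use that result as evidence.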

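# How exactly do Memory and the Context Engine relate?

We can now return to the key engineering question of this chapter: how exactly do Memory and the Context Engine relate?

In our implementation, Memory only ever re-enters the model's context as the result of a specific set of tool calls. The benefit of this design is that it keeps tight control over the model's attention: a memory is recalled only at the moment a fact from it is actually needed. It also means memory is managed by the Context Engine like any other context. Once memory content enters agent.history as a tool result, it is treated the same as results from read_file, execute_command, or fetch_url, and is subject to the Context Engine's tool-result truncation, token estimation, history compaction, and dynamic compression.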
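A toy sketch of that "same path for every tool" idea follows. The message shape, the function names, and the character-based truncation rule are all assumptions made for illustration; the real Context Engine works on token estimates and has more stages. The point is only that nothing in the pipeline branches on whether the tool was a memory tool.

```typescript
// Illustrative only: a simplified history message and a character-based
// truncation rule. The real Context Engine is not shown in this chapter.
type HistoryMessage = { role: "user" | "assistant" | "tool"; toolName?: string; content: string };

function truncateToolResult(content: string, maxChars: number): string {
  if (content.length <= maxChars) return content;
  return content.slice(0, maxChars) + `\n[truncated ${content.length - maxChars} chars]`;
}

// Every tool result takes the same path into history. There is no special
// case for search_memory or read_note.
function appendToolResult(
  history: HistoryMessage[],
  toolName: string,
  result: unknown,
  maxChars: number
): HistoryMessage[] {
  const content = truncateToolResult(JSON.stringify(result), maxChars);
  return [...history, { role: "tool", toolName, content }];
}

// Usage: a memory recall and a file read are handled identically.
let demoHistory: HistoryMessage[] = [{ role: "user", content: "Write the script in my preferred language." }];
demoHistory = appendToolResult(demoHistory, "search_memory", { results: [{ id: "mem_...", content: "User prefers TypeScript for scripts." }] }, 2000);
demoHistory = appendToolResult(demoHistory, "read_file", { path: "package.json", text: "{}" }, 2000);
```

A consequence of this design is that recalled memory can itself be truncated or compacted later, which is one more reason for the model to re-query memory through tools instead of assuming it still has the content in view.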

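# Side work: adding multi-session support so memory has somewhere to be useful

In this change we also added multi-session support. A session saves its message log to a local JSON state file, and the agent can restore any session from that file at any time.

The session store is responsible for:

Saving the messages of each TUI session.
Saving the WorkspaceState.
Saving exchangeCount.
Saving usage telemetry.
Maintaining the active session index.
Supporting session list, switch, rename, and delete.

After each new session completes its first request, the demo also calls the summary model in the background to generate a short title: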

	[session:title] model=... messages=...
	[session:title] "Memory System Article"; tokens=8 chars=20 ...

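The check here is that exchangeCount === 0 and the title is still the default "Untitled session", so only the first turn of a new session triggers it. It reuses the summary model configured via AGENT_SUMMARY_PROVIDER / AGENT_SUMMARY_MODEL and runs as a background task, so it does not block the next user input.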
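The trigger condition and the non-blocking call can be sketched as follows. The function names, the TitledSession type, and the generateTitle callback are hypothetical stand-ins; the extra guard against overwriting a renamed session is also an assumption, not something the text above states.

```typescript
// Hypothetical sketch of the title trigger; not the demo's actual code.
const DEFAULT_SESSION_TITLE = "Untitled session";

type TitledSession = { title: string; exchangeCount: number };

// Only a brand-new session that still has the default title qualifies.
function shouldGenerateTitle(session: TitledSession): boolean {
  return session.exchangeCount === 0 && session.title === DEFAULT_SESSION_TITLE;
}

// Fire-and-forget: the promise is deliberately not awaited, so the caller can
// accept the next user input right away. Returns whether a job was started.
function enqueueSessionTitle(
  session: TitledSession,
  generateTitle: () => Promise<string>, // stand-in for the summary model call
  onError: (error: unknown) => void = () => {}
): boolean {
  if (!shouldGenerateTitle(session)) return false;
  void generateTitle()
    .then((title) => {
      // Assumed guard: do not overwrite a title the user set in the meantime.
      if (session.title === DEFAULT_SESSION_TITLE) session.title = title;
    })
    .catch(onError);
  return true;
}
```

Routing failures to onError instead of throwing keeps a failed title request from ever interrupting the main conversation loop.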

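The system prompt for title generation is restrained: use the user's language, 3 to 8 words, no quotes or Markdown, return only the name. normalizeSessionTitle() then applies one more cleanup pass, removing a leading title:, list markers, surrounding quotes, and trailing punctuation, and truncating to 80 characters. The title exists only to make the local session list easier to browse and is never written to long-term memory.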
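A cleanup pass of this kind might look like the sketch below. It is named normalizeSessionTitleSketch to make clear that it is not the project's normalizeSessionTitle(); the exact character sets and the order of the steps are assumptions.

```typescript
// Hypothetical re-implementation of the cleanup steps described above.
function normalizeSessionTitleSketch(raw: string, maxChars = 80): string {
  let title = raw.trim();
  // Drop a leading list marker such as "- ", "* ", "1. " or "2) ".
  title = title.replace(/^(?:[-*•]|\d+[.)])\s+/, "");
  // Drop a leading "title:" label, case-insensitively.
  title = title.replace(/^title\s*:\s*/i, "");
  // Strip leading quotes, then any trailing mix of quotes and punctuation,
  // so both "Name". and "Name." come out clean.
  title = title.replace(/^["'“”‘’`\s]+/, "");
  title = title.replace(/["'“”‘’`.,;:!?。！？\s]+$/, "");
  // Enforce the length cap last, then trim any space left at the cut.
  return title.slice(0, maxChars).trim();
}
```

Doing this in code as well as in the prompt means a model that ignores the formatting instructions still cannot break the session list layout.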

This shows the difference between the three kinds of persistence:

	agent.history
	  -> the full conversation history held by the current Agent instance

	.agent-sessions/
	  -> TUI session history + Workspace notes + usage

	.agent-memory/MEMORY.md
	  -> long-term memory across sessions

# Back to the Harness: memory is not a magic prompt

At this point, Singularity's Harness has two core capabilities.

The first is the Context Engine. It handles short-term context management:

	tool result truncation
	token estimate
	budget decision
	handoff summary
	dynamic compression
	request view

The second is the Memory System. It handles state that can be persisted and recalled:

	Workspace notes
	Markdown long-term memory
	memory tools
	static memory instructions
	session persistence

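Together, these two pieces change what the agent is capable of.

The Chapter 2 agent could only "walk forward along the conversation history". Whatever was in the history, it could see; once the history got too long, it started to struggle.

The Chapter 3 agent could "tidy up what it can currently see". It could compress old history, keep recent context, and truncate long tool results.

From Chapter 4 onward, the agent can "actively maintain state". It can write the important facts of the current task into the Workspace and write long-term preferences and lessons into the Markdown Store, then fetch them back through tools when needed.

But this memory capability is not a magic prompt. It is closer to a set of external state interfaces: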

	I might need memory.
	-> call search_memory/read_note
	-> inspect tool result
	-> use retrieved evidence
	-> continue the task

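This is also what I consider the healthiest form of memory in a coding agent: do not let the model pretend it always remembers everything; give it a reliable way to look things up when it needs to.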

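# Summary

In this chapter, we added WorkspaceMemory as mid-term memory for the current task; added MarkdownMemoryStore as local long-term memory; added six memory tools (write_note, read_note, list_notes, update_workspace, store_memory, and search_memory); wired the static memory instructions into PromptBuilder; wired the memory tools into the Agent runtime registry; and added /memory, /notes, /forget-notes, multi-session persistence, and background title generation to the demo.

In the next chapter, we will implement the third piece of the Harness: the Planner. With the Context Engine and the Memory System in place, the agent can already manage "what to look at" and "what to remember". The next problem is "what to do first, what to do next, and what counts as done", that is, task decomposition, task trees, plan updates, and Plan-and-Solve. At that point the agent is no longer just looping over tool calls or just remembering state: it has an action structure that it can keep revising, and it keeps acting until the goal is reached.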
